Search for: All records

Creators/Authors contains: "The_DAMIC-M_collaboration"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not yet be freely available during the publisher's embargo period.

Some links on this page may take you to non-federal websites, whose policies may differ from those of this site.

  1. We present WonderWorld, a novel framework for interactive 3D scene generation that enables users to interactively specify scene contents and layout and see the created scenes with low latency. The major challenge lies in achieving fast generation of 3D scenes. Existing scene generation approaches fall short in speed, as they often require (1) progressively generating many views and depth maps, and (2) time-consuming optimization of the scene representations. Our approach does not need multiple views, and it leverages a geometry-based initialization that significantly reduces optimization time. Another challenge is generating coherent geometry that allows all scenes to be connected. We introduce guided depth diffusion, which allows partial conditioning of depth estimation (a hedged sketch follows this entry). WonderWorld generates connected and diverse 3D scenes in less than 10 seconds on a single A6000 GPU, enabling real-time user interaction and exploration. Our interactive demo, full code, data, and software can be found at https://kovenyu.com/WonderWorld/
    Free, publicly-accessible full text available June 11, 2026
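    A minimal, hedged sketch of what "partial conditioning of depth estimation" can look like, in the style of diffusion inpainting: the known depth is re-noised to the current noise level at every step and overwrites the observed region. Here `model` and `scheduler` are placeholders for any trained depth-diffusion denoiser and a DDPM-style scheduler (e.g. from the diffusers library), not the WonderWorld implementation.

        import torch

        def guided_depth_sampling(model, scheduler, known_depth, known_mask, shape):
            """Sample a depth map that matches known_depth where known_mask == 1
            and is generated freely elsewhere."""
            x = torch.randn(shape)  # start from pure noise
            for t in scheduler.timesteps:
                # Re-noise the observed depth to the current noise level so it
                # is statistically compatible with the partially denoised x.
                known_t = scheduler.add_noise(known_depth, torch.randn(shape), t)
                x = known_mask * known_t + (1 - known_mask) * x  # partial conditioning
                eps = model(x, t)                          # predict the noise
                x = scheduler.step(eps, t, x).prev_sample  # one reverse-diffusion step
            return x

    Because only the unmasked region is generated freely, the new scene's depth stays consistent with geometry already placed in the world.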
  2. We introduce RandAR, a decoder-only visual autoregressive (AR) model capable of generating images in arbitrary token orders. Unlike previous decoder-only AR models that rely on a predefined generation order, RandAR removes this inductive bias, unlocking new capabilities in decoder-only generation. Our essential design enabling random order is to insert a "position instruction token" before each image token to be predicted, representing the spatial location of the next image token (a sketch follows this entry). Trained on randomly permuted token sequences -- a more challenging task than fixed-order generation -- RandAR achieves performance comparable to its conventional raster-order counterpart. More importantly, decoder-only transformers trained on random orders acquire new capabilities. To address the efficiency bottleneck of AR models, RandAR adopts parallel decoding with KV-Cache at inference time, enjoying a 2.5x acceleration without sacrificing generation quality. Additionally, RandAR supports inpainting, outpainting, and resolution extrapolation in a zero-shot manner. We hope RandAR inspires new directions for decoder-only visual generation models and broadens their applications across diverse scenarios. Our project page is at https://rand-ar.github.io/.
    Free, publicly-accessible full text available June 11, 2026
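    An illustrative sketch of how a randomly ordered training sequence with position instruction tokens might be assembled. `pos_embed` and `img_embed` are hypothetical embedding tables, and the real RandAR architecture may differ in detail.

        import torch
        import torch.nn as nn

        num_positions, vocab, dim = 256, 1024, 64      # 16x16 token grid (toy sizes)
        pos_embed = nn.Embedding(num_positions, dim)   # position instruction tokens
        img_embed = nn.Embedding(vocab, dim)           # image token embeddings

        def build_random_order_sequence(image_tokens):
            """image_tokens: (N,) token ids on a flattened 2D grid."""
            order = torch.randperm(image_tokens.shape[0])  # random generation order
            pairs = []
            for idx in order:
                pairs.append(pos_embed(idx))                # "next, predict location idx"
                pairs.append(img_embed(image_tokens[idx]))  # the token at that location
            # A decoder-only transformer is then trained to predict each image
            # token from everything that precedes it in this interleaved sequence.
            return torch.stack(pairs)                       # (2N, dim)

        seq = build_random_order_sequence(torch.randint(0, vocab, (num_positions,)))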
  3. As the field of representation learning grows, there has been a proliferation of different loss functions to solve different classes of problems. We introduce a single information-theoretic equation, I-Con, that generalizes a large collection of modern loss functions in machine learning. In particular, we introduce a framework showing that several broad classes of machine learning methods are precisely minimizing an integrated KL divergence between two conditional distributions: the supervisory and learned representations (a hedged rendering of this equation follows this entry). This viewpoint exposes a hidden information geometry underlying clustering, spectral methods, dimensionality reduction, contrastive learning, and supervised learning. This framework enables the development of new loss functions by combining successful techniques from across the literature. We not only present a wide array of proofs connecting over 23 different approaches, but we also leverage these theoretical results to create state-of-the-art unsupervised image classifiers that achieve a +8% improvement over the prior state of the art on unsupervised classification on ImageNet-1K. We also demonstrate that I-Con can be used to derive principled debiasing methods that improve contrastive representation learners.
    Free, publicly-accessible full text available April 25, 2026
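    A hedged rendering of the unifying objective described above, written for a finite dataset (the paper gives the exact continuous form): p(· | i) is the supervisory conditional distribution over neighbours of point i (defined by labels, augmentations, or affinities, depending on the method), and q_φ(· | i) is the distribution induced by the learned representation.

        \mathcal{L}(\phi) \;=\; \frac{1}{N}\sum_{i=1}^{N}
            D_{\mathrm{KL}}\!\big(\, p(\cdot \mid i) \;\|\; q_{\phi}(\cdot \mid i) \,\big)

    Different choices of p and q_φ then recover clustering, contrastive, dimensionality-reduction, and supervised losses as special cases.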
  4. Recent approaches have shown promise in distilling diffusion models into efficient one-step generators. Among them, Distribution Matching Distillation (DMD) produces one-step generators that match their teacher in distribution, without enforcing a one-to-one correspondence with the sampling trajectories of their teachers. However, to ensure stable training, DMD requires an additional regression loss computed using a large set of noise-image pairs generated by the teacher with many steps of a deterministic sampler. This is costly for large-scale text-to-image synthesis and limits the student's quality, tying it too closely to the teacher's original sampling paths. We introduce DMD2, a set of techniques that lift this limitation and improve DMD training. First, we eliminate the regression loss and the need for expensive dataset construction. We show that the resulting instability is due to the fake critic not estimating the distribution of generated samples accurately, and we propose a two time-scale update rule as a remedy. Second, we integrate a GAN loss into the distillation procedure, discriminating between generated samples and real images. This lets us train the student model on real data, mitigating the imperfect real score estimation from the teacher model and enhancing quality (a toy sketch of the resulting training loop follows this entry). Lastly, we modify the training procedure to enable multi-step sampling. We identify and address the training-inference input mismatch problem in this setting by simulating inference-time generator samples during training. Taken together, our improvements set new benchmarks in one-step image generation, with FID scores of 1.28 on ImageNet-64x64 and 8.35 on zero-shot COCO 2014, surpassing the original teacher despite a 500x reduction in inference cost. Further, we show that our approach can generate megapixel images by distilling SDXL, demonstrating exceptional visual quality among few-step methods.
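    A toy-scale, hedged sketch of the training recipe described above: the fake critic is updated on several generated batches per generator step (the two time-scale update rule), and a GAN discriminator trained on real images adds a realism term. All modules and loss stand-ins below are tiny placeholders, not the paper's models.

        import torch
        import torch.nn as nn

        dim = 16
        generator = nn.Linear(dim, dim)        # one-step student (placeholder)
        fake_critic = nn.Linear(dim, dim)      # scores generated samples
        discriminator = nn.Linear(dim, 1)      # GAN head on real vs. fake
        opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
        opt_c = torch.optim.Adam(fake_critic.parameters(), lr=1e-4)
        opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
        bce = nn.BCEWithLogitsLoss()

        def real_batch():                      # stand-in for real images
            return torch.randn(32, dim) + 2.0

        for step in range(100):
            # Critic is updated more often than the generator (two time-scale rule).
            for _ in range(5):
                x_fake = generator(torch.randn(32, dim)).detach()
                # Stand-in for denoising score matching on generated samples.
                c_loss = (fake_critic(x_fake) - x_fake).pow(2).mean()
                opt_c.zero_grad(); c_loss.backward(); opt_c.step()

            x_fake = generator(torch.randn(32, dim))
            # Stand-in for the distribution-matching gradient (teacher vs. critic).
            dm_loss = (x_fake - fake_critic(x_fake).detach()).pow(2).mean()
            gan_loss = bce(discriminator(x_fake), torch.ones(32, 1))  # fool D
            g_loss = dm_loss + 0.1 * gan_loss
            opt_g.zero_grad(); g_loss.backward(); opt_g.step()

            # Discriminator sees real data, mitigating imperfect teacher scores.
            d_loss = bce(discriminator(real_batch()), torch.ones(32, 1)) + \
                     bce(discriminator(x_fake.detach()), torch.zeros(32, 1))
            opt_d.zero_grad(); d_loss.backward(); opt_d.step()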
  5. Realistic object interactions are crucial for creating immersive virtual experiences, yet synthesizing realistic 3D object dynamics in response to novel interactions remains a significant challenge. Unlike unconditional or text-conditioned dynamics generation, action-conditioned dynamics requires perceiving the physical material properties of objects and grounding the 3D motion prediction on these properties, such as object stiffness. However, estimating physical material properties is an open problem due to the lack of material ground-truth data, as measuring these properties for real objects is highly difficult. We present PhysDreamer, a physics-based approach that endows static 3D objects with interactive dynamics by leveraging the object dynamics priors learned by video generation models. By distilling these priors, PhysDreamer enables the synthesis of realistic object responses to novel interactions, such as external forces or agent manipulations. We demonstrate our approach on diverse examples of elastic objects and evaluate the realism of the synthesized interactions through a user study. PhysDreamer takes a step towards more engaging and realistic virtual experiences by enabling static 3D objects to dynamically respond to interactive stimuli in a physically plausible manner (a toy illustration follows this entry). See our project page at this https URL.
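    A toy, hedged illustration of the core idea of grounding motion on material properties: make a simulator differentiable in a stiffness parameter and fit that parameter so the simulated motion matches motion taken from a video prior. PhysDreamer itself optimises a full material field through a differentiable MPM simulator over 3D Gaussians; the damped spring below is only a stand-in.

        import torch

        dt, steps = 0.01, 60

        def simulate(k):
            """Damped 1-D spring, differentiable in the stiffness k."""
            x, v, xs = torch.tensor(1.0), torch.tensor(0.0), []
            for _ in range(steps):          # a = -k x - c v
                a = -k * x - 0.5 * v
                v = v + dt * a
                x = x + dt * v
                xs.append(x)
            return torch.stack(xs)

        observed = simulate(torch.tensor(25.0)).detach()  # stand-in for video-prior motion
        log_k = torch.tensor(2.0, requires_grad=True)     # unknown (log-)stiffness
        opt = torch.optim.Adam([log_k], lr=0.05)
        for _ in range(500):
            loss = (simulate(log_k.exp()) - observed).pow(2).mean()
            opt.zero_grad(); loss.backward(); opt.step()
        print(float(log_k.exp()))  # should move toward 25, the stiffness
                                   # consistent with the observed motion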
  6. We present a passive non-line-of-sight method that infers the number of people or the activity of a person from the observation of a blank wall in an unknown room. Our technique analyzes complex, imperceptible changes in indirect illumination in a video of the wall to reveal a signal that is correlated with motion in the hidden part of a scene (a minimal sketch of this extraction step follows this entry). We use this signal to classify among zero, one, or two moving people, or the activity of a person in the hidden scene. We train two convolutional neural networks using data collected from 20 different scenes, and achieve an accuracy of 94% for both tasks in unseen test environments and real-time online settings. Unlike other passive non-line-of-sight methods, our technique does not rely on known occluders or controllable light sources, and it generalizes to unknown rooms with no recalibration. We analyze the generalization and robustness of our method with both real and synthetic data, and study the effect of the scene parameters on the signal quality.
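    A minimal, hedged sketch of the signal-extraction idea: remove the static temporal mean of the wall video, then collapse one spatial dimension to get a space-time image of the indirect-illumination changes. The paper's CNN classifier would consume this image; the shapes and the averaging axis here are illustrative.

        import numpy as np

        def spacetime_signal(video):
            """video: (T, H, W) grayscale frames of a blank wall."""
            residual = video - video.mean(axis=0, keepdims=True)  # drop the static wall
            return residual.mean(axis=1)       # (T, W) space-time motion signal

        frames = np.random.rand(128, 64, 64).astype(np.float32)   # stand-in video
        signal = spacetime_signal(frames)      # input to the 0/1/2-person classifier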
  7. We recover a video of the motion taking place in a hidden scene by observing changes in indirect illumination in a nearby uncalibrated visible region. We solve this problem by factoring the observed video into a matrix product between the unknown hidden-scene video and an unknown light-transport matrix. This task is extremely ill-posed, as any non-negative factorization will fit the data. Inspired by recent work on the Deep Image Prior, we parameterize the factor matrices using randomly initialized convolutional neural networks trained in a one-off manner, and we show that this yields decompositions that reflect the true motion in the hidden scene (a sketch follows this entry).
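    A hedged sketch of this factorization: flatten the observed wall video to a matrix Z (pixels x frames) and explain it as Z ~ T @ L, with T the unknown light-transport matrix and L the hidden-scene video. As in the Deep Image Prior, each factor comes from a small randomly initialised network fit one-off to this single observation; linear nets below stand in for the paper's convolutional ones, and all sizes are toys.

        import torch
        import torch.nn as nn

        p, t, h = 256, 100, 32                 # pixels, frames, hidden size
        Z = torch.rand(p, t)                   # observed wall video (stand-in)
        net_T = nn.Linear(64, p * h)           # generates the transport factor
        net_L = nn.Linear(64, h * t)           # generates the hidden-video factor
        z1, z2 = torch.randn(64), torch.randn(64)   # fixed random inputs
        opt = torch.optim.Adam([*net_T.parameters(), *net_L.parameters()], lr=1e-3)
        for _ in range(2000):
            T = net_T(z1).reshape(p, h).relu() # non-negative light transport
            L = net_L(z2).reshape(h, t).relu() # non-negative hidden intensities
            loss = (T @ L - Z).pow(2).mean()   # one-off fit to this observation
            opt.zero_grad(); loss.backward(); opt.step()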
  8. Context. 3C 84 is a nearby radio source with a complex total-intensity structure, showing linear polarisation and spectral patterns. A detailed investigation of the central engine region necessitates the use of very-long-baseline interferometry (VLBI) above the hitherto available maximum frequency of 86 GHz. Aims. Using ultrahigh-resolution VLBI observations at the currently highest available frequency of 228 GHz, we aim to perform a direct detection of compact structures and understand the physical conditions in the compact region of 3C 84. Methods. We used Event Horizon Telescope (EHT) 228 GHz observations and, given the limited (u, v)-coverage, applied geometric model fitting to the data. Furthermore, we employed quasi-simultaneously observed, ancillary multi-frequency VLBI data for the source in order to carry out a comprehensive analysis of the core structure. Results. We report the detection of a highly ordered, strong magnetic field around the central supermassive black hole of 3C 84. The brightness temperature analysis suggests that the system is in equipartition. We also determined a turnover frequency of ν_m = (113 ± 4) GHz, a corresponding synchrotron self-absorbed magnetic field of B_SSA = (2.9 ± 1.6) G (the standard scaling behind this kind of estimate is sketched after this entry), and an equipartition magnetic field of B_eq = (5.2 ± 0.6) G. Three components are resolved, with the highest fractional polarisation detected for this object (m_net = (17.0 ± 3.9)%). The positions of the components are compatible with those seen in low-frequency VLBI observations since 2017-2018. We report a steeply negative slope of the spectrum at 228 GHz. We used these findings to test existing models of jet formation, propagation, and Faraday rotation in 3C 84. Conclusions. The findings of our investigation into different flow geometries and black hole spins support an advection-dominated accretion flow in a magnetically arrested state around a rapidly rotating supermassive black hole as a model of the jet-launching system in the core of 3C 84. However, systematic uncertainties due to the limited (u, v)-coverage cannot be ignored. Our upcoming work using new EHT data, which offer full imaging capabilities, will shed more light on the compact region of 3C 84.
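    For context on the B_SSA value quoted above: synchrotron self-absorption estimates of this kind conventionally use the homogeneous-source scaling (e.g. Marscher 1983). A hedged sketch, with θ in mas, ν_m in GHz, S_m in Jy, b(α) an order-unity spectral constant, and the Doppler/redshift factor written explicitly (exact conventions vary between papers):

        B_{\mathrm{SSA}} \;\simeq\; 10^{-5}\, b(\alpha)\,
            \left(\frac{\theta}{\mathrm{mas}}\right)^{4}
            \left(\frac{\nu_m}{\mathrm{GHz}}\right)^{5}
            \left(\frac{S_m}{\mathrm{Jy}}\right)^{-2}
            \frac{\delta}{1+z}\ \mathrm{G}

    The steep ν_m^5 dependence is why pinning the turnover at (113 ± 4) GHz constrains the field comparatively tightly.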
  9. Context. The nearby elliptical galaxy M87 contains one of only two supermassive black holes whose emission surrounding the event horizon has been imaged by the Event Horizon Telescope (EHT). In 2018, more than two dozen multi-wavelength (MWL) facilities (from radio to γ-ray energies) took part in the second M87 EHT campaign. Aims. The goal of this extensive MWL campaign was to better understand the physics of the accreting black hole M87*, the relationship between the inflow and inner jets, and the high-energy particle acceleration. Understanding the complex astrophysics is also a necessary first step towards performing further tests of general relativity. Methods. The MWL campaign took place in April 2018, overlapping with the EHT M87* observations. We present a new, contemporaneous spectral energy distribution (SED) ranging from radio to very high-energy (VHE) γ-rays, as well as details of the individual observations and light curves. We also conducted phenomenological modelling to investigate the basic source properties. Results. We present the first VHE γ-ray flare from M87 detected since 2010. The flux above 350 GeV more than doubled within a period of ≈36 hours (a back-of-the-envelope size estimate follows this entry). We find that the X-ray flux is enhanced by about a factor of two compared to 2017, while the radio and millimetre core fluxes are consistent between 2017 and 2018. We detect evidence for a monotonically increasing jet position angle that corresponds to variations in the bright spot of the EHT image. Conclusions. Our results show the value of continued MWL monitoring together with precision imaging for addressing the origins of high-energy particle acceleration. While we cannot currently pinpoint the precise location where such acceleration takes place, the new VHE γ-ray flare already presents a challenge to simple one-zone leptonic emission models, and it emphasises the need for combined image and spectral modelling.
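    A standard back-of-the-envelope reading of the 36-hour doubling time (our illustration, not a claim from the abstract itself): causality limits the size of the emitting region to roughly

        R \;\lesssim\; \frac{c\,\Delta t\,\delta}{1+z}
          \;\approx\; (3\times10^{10}\ \mathrm{cm\,s^{-1}})\,(1.3\times10^{5}\ \mathrm{s})
          \;\approx\; 4\times10^{15}\ \mathrm{cm} \;\approx\; 2\,R_{\mathrm{S}}

    for δ ≈ 1 and z ≈ 0.004, where R_S = 2GM/c² ≈ 1.9×10^15 cm for the EHT-measured mass M ≈ 6.5×10^9 M_⊙. The flare must therefore originate in a region only a few Schwarzschild radii across, which is what makes it so constraining for one-zone emission models.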